Detailed Action

Election/Restriction 

1. Restriction to one of the following inventions is required under 35 U.S.C. § 121: 

Invention I: Claims 1-11 and 12-15, drawn, respectively, to methods and apparatuses for
creating a high-quality virtual image, as seen from a virtual viewpoint,
classified in Class 345, Subclass 955.

Invention II: Claims 16-21, drawn to methods for using and arranging a plurality of fixed
imagers to create a mosaic, classified in Class 348, Subclass 42.

Invention III: Claims 22-24 and 25-27, drawn to methods for creating a local depth map of
a scene, classified in Class 382, Subclass 154.

2. Inventions I, II, and III are related as subcombinations disclosed as usable
together in a single combination. The subcombinations are distinct from each other if they are shown to be 
separately usable. In the instant case, Invention I has utility separate from that of Invention II such as 
creating a high-quality virtual image, as seen from a virtual viewpoint. The creation of such an image need 
not follow the methodology of Invention II, nor would it require a configuration of imagers such as in 
Invention II. In the instant case, Invention I has utility separate from that of Invention III such as creating a 
high-quality virtual image, as seen from a virtual viewpoint. The creation of such an image would not 
require depth images derived according to Invention III. In the instant case, Invention III has utility 
separate from that of Invention II such as creating a local depth map of a scene. The creation of such a local 
depth map can be achieved through imager arrangements other than those of Invention II. See MPEP § 
806.05(d). 

3. Because these inventions are distinct for the reasons given above and have acquired a separate 
status in the art as shown by their different classification, restriction for examination purposes as indicated 
is proper. 

Election by Telephone

4. During a telephone conversation with Mr. Kenneth Nigon on November 15, 2004, a provisional
election was made without traverse to prosecute Invention I - i.e. Claims 1-11 and 12-15. Affirmation of
this election must be made by applicant in replying to this Office action. Claims 16-27 are withdrawn from
further consideration by the examiner, 37 CFR § 1.142(b), as being drawn to a non-elected invention.



Specification 

Objections 

5. The disclosure is objected to because of the following informalities: 

a. On page 21 of the Specification (paragraph [0102], second sentence), the word "reach" 
should be replaced with the word "reached". 

b. Equation (3) on page 23 of the Specification is miswritten. It should be expressed as: 

$$K(x, y, d) = \frac{1}{\max_{(x, y, d) \in \delta} S_n(x, y, d)}$$

6. Appropriate correction is required. 



Claims 

Objections 

7. Claim 1 is objected to because of the following informalities. To clarify the language of Claim 1, it
is suggested that the limitations expressed in the phrase "scene covered by the plurality of fixed imagers"
be removed from their place in the current claim language and placed instead in a less grammatically
awkward position. The word "within" in line 2 of Claim 1 should also be changed to "of". For example, the
preamble could be rephrased as:

In a system using a plurality of fixed imagers covering a scene, a method to create a high quality virtual image, in real-time,
as seen from a virtual viewpoint of the scene, comprising . . .

Appropriate correction is required.

Rejections Under 35 U.S.C. § 102(b) 

8. The following is a quotation of the appropriate paragraphs of 35 U.S.C. § 102 that form the basis 
for the rejections under this section made in this Office action: 

(b) the invention was patented or described in a printed publication in this or a foreign country or in public use or on sale in 
this country, more than one year prior to the date of application for patent in the United States. 

9. Claims 1-2, 5, 8-12, 15 are rejected under 35 U.S.C. § 102(b) as being anticipated by 
[ChenWilliams93] (S.E. Chen and L. Williams, View Interpolation for Image Synthesis, ACM- 
SIGGRAPH, 1993). 

10. The following is in regard to Claims 1 and 12. [ChenWilliams93] is widely considered to be the
seminal work in image-based rendering (IBR). [ChenWilliams93] introduces an approach for view
synthesis based on the linear interpolation of corresponding image points using range (depth) data to obtain
correspondences. Generally, the method assumes a plurality of (at least two) viewpoints (which, in turn, may
imply a plurality of corresponding imagers) of a given scene (e.g. [ChenWilliams93], page 281, left
column, paragraph 1¹). The presumption here is that, like most image-based rendering techniques,
[ChenWilliams93] renders the interpolated scenes in real-time.

11. According to [ChenWilliams93], view interpolation (i.e. the creation of a virtual image - an image
of the scene viewed from a virtual viewpoint) comprises the following steps:

(1.a.) The algorithm begins with at least two images of a scene (e.g. [ChenWilliams93],
Section 2 View Morphing, paragraph 1, sentence 3). These can, of course, be captured by
a set of corresponding "real" cameras.

(1.b.) At least two depth maps (range data of the images - e.g. [ChenWilliams93] page 280, left
column, paragraph 1, lines 4-7) are generated using ranging devices, photogrammetric
techniques ([ChenWilliams93] page 280, left column, paragraph 2, lines 4-6), or the like.

(1.c.) 1. The depth maps and viewpoint information are used to recover a dense set of
correspondences between the pixels in each pair of images (e.g. [ChenWilliams93]
page 280, left column, paragraph 1, lines 4-7).
2. These pair-wise pixel correspondences are used to determine a set of 3D spatial
offset vectors ([ChenWilliams93], page 281, left column, paragraph 2, lines 4-18),
which are then stored as a morph map².

The morph maps represent the forward mapping from one image to the other. However, because the map is
generally many-to-one, a backward mapping must also be supplied in order to provide a complete
representation of the pair-wise pixel correspondence. Thus, for each pair of images two morph maps must
be provided ([ChenWilliams93], Section 2.1, Establishing Pixel Correspondence, paragraph 1, last two
sentences). In other words, at least two sets of warp parameters (e.g. morph maps) are determined, each
corresponding to one of the input images. Again, the morph maps are determined using the depth maps (range
data) associated with each of the given viewpoints (step (1.c.1) above).

1 When referring to paragraphs in the cited references, the convention followed here is that the paragraph number is assigned to
paragraphs of a given column (if applicable) or section, sequentially, beginning with the first full paragraph. Paragraphs that
carry over to other columns will be referred to as the last paragraph of the column in which they began.

12. [ChenWilliams93] further comprises the steps of:

(1.d.) 1. To generate a virtual view between a pair of images, the offset vectors of the morph
map are interpolated linearly and the pixels in the source image are moved by the
interpolated vector to their destinations (i.e. positions in the virtual image). See
[ChenWilliams93] Section 2.2 Interpolating Correspondences, paragraph 1. This
process yields an interpolated morph map³.
2. [ChenWilliams93] forward map the source image using the interpolated morph map
(e.g. [ChenWilliams93] Section 2.2 Interpolating Correspondences, paragraph 1 and
[ChenWilliams93], Section 3.2 Interactive Interpolation, step 3), thereby generating
a warped or interpolated image representing the interpolated (virtual) viewpoint.
3. Each of the input images can act as both a source and destination image
([ChenWilliams93], Section 2 View Morphing, paragraph 2, lines 5-7). This process
is repeated for each of the source images ([ChenWilliams93], Section 2 View
Morphing, paragraph 2, last sentence).

By repeating step (1.d.) for each of the source images, at least two warped (interpolated) images are
obtained. Clearly, in the case of two input images, the first of these images is obtained according to the
forward-mapping morph map and the second from the backward-mapping morph map (see step (1.c.)
above)⁴. Again, these morph maps are considered here as warp parameters and are respectively associated
with the given images.

2 Conceptually, the morph map is essentially the same as a disparity map ([ChenWilliams93], page 281, left column, paragraph 2,
lines 10-15). It is well known in the field of computer vision that disparity is inversely proportional to the depth. Seen in this
light, the morph map is also indicative of depth and can, therefore, be construed as a depth map. See [ChenWilliams93] Section
2.4.2, last paragraph, sentence 2.

3 The interpolated morph map forward maps the pixels of the source image to the interpolated image. It essentially approximates
the perspective projection of the scene into the interpolated view.
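
As a rough illustration of step (1.d.), and not a reproduction of the implementation in [ChenWilliams93], the following Python sketch scales per-pixel offset vectors by an interpolation parameter t and forward maps the source pixels. The function names, the grid-of-tuples representation of a morph map, and the write-order handling of collisions are assumptions made here for illustration only.

```python
def interpolate_morph_map(offsets, t):
    """Scale each per-pixel offset vector (dx, dy) by t, with t in [0, 1]."""
    return [[(t * dx, t * dy) for (dx, dy) in row] for row in offsets]


def forward_warp(source, offsets, t):
    """Forward map source pixels by the linearly interpolated offsets.

    Pixels that land outside the image are dropped. Destinations that
    receive no pixel remain None (holes). Collisions are resolved by
    write order here, which is a simplification: visibility is not
    resolved by depth in this sketch.
    """
    height, width = len(source), len(source[0])
    warped = [[None] * width for _ in range(height)]
    scaled = interpolate_morph_map(offsets, t)
    for y in range(height):
        for x in range(width):
            dx, dy = scaled[y][x]
            nx, ny = int(round(x + dx)), int(round(y + dy))
            if 0 <= nx < width and 0 <= ny < height:
                warped[ny][nx] = source[y][x]
    return warped
```

With t = 0 the source image is returned unchanged, and with t = 1 every pixel moves by its full offset vector; intermediate values of t correspond to intermediate (virtual) viewpoints.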

13. For rendering purposes, the visibility of the warped pixels must be known. [ChenWilliams93]
resolves visibility using a depth buffer (visible priority list - [ChenWilliams93] Section 2.4.2). Specifically,
this comprises the steps of:

(1.e.) 1. Compositing multiple warped input images using their associated range information
(depth maps), by organizing the pixels (or blocks of pixels) into a fixed visibility
order ([ChenWilliams93] Section 2.3.1, Fig. 7 and Section 2.4.2, paragraph 3).
2. Once the visibility has been resolved for each pixel (and holes filled), the image
corresponding to the virtual viewpoint can be properly rendered.

Essentially, the merging occurs in the depth buffer, with each depth layer being composited (or merged) in
front-to-back order.
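
The compositing described in step (1.e.) can be sketched, under simplifying assumptions, as a per-pixel depth test. This is an ordinary depth-buffer merge written for illustration; it is not the priority-list scheme of the cited reference, and the function name and the (color, depth) sample layout are hypothetical.

```python
def composite_by_depth(warped_views):
    """Merge warped views, keeping the nearest sample at each pixel.

    Each view is a 2D grid whose entries are (color, depth) tuples, or
    None where no source pixel landed. Pixels that are None in every
    view stay None in the output (holes still to be filled).
    """
    height, width = len(warped_views[0]), len(warped_views[0][0])
    merged = [[None] * width for _ in range(height)]
    depth_buffer = [[float("inf")] * width for _ in range(height)]
    for view in warped_views:
        for y in range(height):
            for x in range(width):
                sample = view[y][x]
                if sample is None:
                    continue
                color, depth = sample
                # Smaller depth means closer to the virtual viewpoint.
                if depth < depth_buffer[y][x]:
                    depth_buffer[y][x] = depth
                    merged[y][x] = color
    return merged
```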

14. Image synthesis according to the teachings of [ChenWilliams93], therefore, comprises all
substantive elements as set forth in Claim 1. The rejection of Claim 12 follows similarly.



4 Note that the designation of forward-mapping and backward-mapping is relative to which image is considered initially as the
source. Clearly, this designation can be swapped.

15. The following is in regard to Claim 2. In [ChenWilliams93], as in most IBR techniques, the view
is that of a virtual camera controlled by the user. See [ChenWilliams93] page 279, right column, lines 1-2.

16. The following is in regard to Claims 5 and 15. The range data (depth map) associated with each of
the input images can be obtained according to a variety of different techniques. One method suggested by
[ChenWilliams93] is to obtain the range data using ranging sensors ([ChenWilliams93] page 280, left
column, paragraph 2, lines 4-6). Though not explicitly disclosed in [ChenWilliams93], the following are
clearly inherent aspects of such a configuration:

(5.a.) Mounting the depth (ranging) sensors viewing the scene coincident with the fixed
imagers. It is typically assumed that each pixel of an image is associated with a visible
point in the three-dimensional space of a given scene. Each pixel is thus associated with a
particular depth - the depth of the scene point. Generally, this depth is measured relative
to the center-of-projection (COP) of the corresponding imager or viewpoint⁵. Therefore,
if the aim is to generate range data from the COP of an imager to the viewed scene using
depth sensors, then it is necessary that the depth sensors be mounted in close proximity
(coincident, if feasible) to the location of the imager.

(5.b.) Selecting at least two depth sensors corresponding to the images.

(5.c.) Measuring a plurality of depth values (this is what depth sensors do!) with the depth
sensors. As stated above, the depth values are required for each pixel (i.e. "the plurality
of image coordinates") of the given images to determine the aforesaid pixel-to-pixel
correspondences. See steps (1.b.)-(1.c.) above.

(5.d.) As stated above, a depth map (range data) is obtained for each of the input images. See
steps (1.b.)-(1.c.) above. Clearly, in a configuration that utilizes depth sensors, these
depth maps would consist of the measured depth values.



It has thus been shown that an implementation of [ChenWilliams93], which utilizes ranging sensors to
derive the range data of the given images, inherently comprises all substantive elements as set forth in
Claim 5. The rejection of Claim 15 follows similarly.

5 This, of course, assumes a pinhole camera model. This assumption is made by both the Applicant and [ChenWilliams93]. A
pinhole camera is, for all intents and purposes, located at its center-of-projection.

17. The following is in regard to Claims 8-11. [ChenWilliams93] can synthesize novel views from
images acquired at multiple viewpoints. [ChenWilliams93], therefore, supports multiple cameras (imagers).
The authors pose no limit on the number of input images, other than that there be at least two. Indeed,
[ChenWilliams93] describes view interpolation primarily within the context of a two camera/two input
image system (e.g. [ChenWilliams93], page 281, left column, paragraph 1) - that is, a system where
exactly two images are selected. Also, a three camera system (a system where exactly three images are
selected) is illustrated in Fig. 7 of [ChenWilliams93].

18. Assuming a three camera system (a system where exactly three images are selected), Fig. 7 of
[ChenWilliams93] clearly shows exactly three images that correspond to three fixed imagers (e.g. View1,
View2, and View3) arranged in a triangular fashion. This configuration is, of course, a geometric pattern of
fixed imagers.

Rejections Under 35 U.S.C. § 103(a) 

19. The following is a quotation of 35 U.S.C. § 103(a) which forms the basis for all obviousness 
rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102 of
this title, if the differences between the subject matter sought to be patented and the prior art are such that the subject 
matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art 
to which said subject matter pertains. Patentability shall not be negatived by the manner in which the invention was made. 

20. Claim 3 is rejected under 35 U.S.C. § 103(a) as being unpatentable over [ChenWilliams93], in 
view of [Faugeras95] (O. Faugeras et al., 3-D Reconstruction of Urban Scenes from Sequences of Images, 
INRIA, 1995). 

21. The following is in regard to Claim 3. As shown above, [ChenWilliams93] satisfies all limitations 
of Claim 1. However, [ChenWilliams93] does not disclose selecting the virtual viewpoint based on tracking
at least one feature in the scene. 

22. [Faugeras95] discloses a method to reconstruct a three-dimensional model of a static environment
viewed by one or several cameras whose motions or relative positions are unknown and whose intrinsic
parameters are also unknown and may vary ([Faugeras95], Introduction, paragraph 1). The problem solved
by [Faugeras95], though more in the realm of image-based modeling, is nonetheless similar to that of
[ChenWilliams93]. [Faugeras95] suggests tracking a set of feature points through a given sequence of
images. If a given feature point can be tracked all the way between two of the given views, a
correspondence is established between those views. In this manner, a subset of the given set of images is
used to establish feature correspondences between images. See [Faugeras95] Section 2 Robust Recovery of
the Geometry, paragraph 1, sentence 1 and Section 2.1, paragraph 3, sentences 1-2.

23. Given the teachings of [Faugeras95], it would have been obvious to one of ordinary skill in the art, 
at the time of the Applicant's claimed invention, to select a subset of the given images in [ChenWilliams93] 
based on whether those images contain a set of tracked feature points. The advantages of such a 
modification are (at least) twofold. First, the resultant methodology would be capable of synthesizing novel 
views of designated feature(s) in the observed scene. Secondly, correspondences (and, presumably, all 
subsequent steps) are derived only for the reserved frames. As a result, the computational burden is 
reduced. 

24. Claim 4 is rejected under 35 U.S.C. § 103(a) as being unpatentable over [ChenWilliams93], in 
view of [Trucco98] (E. Trucco and A. Verri, Introductory Techniques for Computer Vision © 1998, 
Prentice-Hall, Chapters 7-8).

25. The following is in regard to Claim 4. As shown above, [ChenWilliams93] satisfies all limitations
of Claim 1. [ChenWilliams93] further suggests determining the depth maps associated with each of the 
given images using photogrammetric techniques ([ChenWilliams93] page 280, left column, paragraph 2, 
lines 4-6). Although these techniques are well-known, [ChenWilliams93] does not propose using any 
particular photogrammetric technique. 

26. Generally speaking, photogrammetry is the study in which the three-dimensional coordinates of 
points on an object are determined by measurements made in two or more photographic images taken from 
different positions. The problem of stereo vision belongs to the field of photogrammetry. The essence of
stereo vision lies in solving the stereo correspondence problem ([Trucco98] Section 7.1.1, paragraph 1). 

27. As shown in [Trucco98] ([Trucco98] Section 7.1.1, paragraph 2, lines 1-6), the disparity map
represents a solution of the stereo correspondence problem, assuming the geometry of the stereo system is
known⁶. As stated previously, disparity is inversely proportional to depth. See also [Trucco98], page 144.
The disparity map and depth map are, therefore, trivially related. Given the suggestion of
[ChenWilliams93] to use photogrammetry to derive the depth maps, the teachings of [Trucco98] with
regard to such a method, and the fact that [ChenWilliams93] presupposes a priori knowledge of the
intrinsic and extrinsic camera parameters⁶ ([Trucco98] page 144: Parameters of a Stereo System), it would
have been obvious to one of ordinary skill in the art, at the time of the Applicant's claimed invention, to
derive the depth (disparity) maps via stereo correspondence.
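
The inverse relation between disparity and depth noted above can be made concrete with a short sketch. It assumes the usual rectified two-camera pinhole configuration, in which depth equals focal length times baseline divided by disparity; the function name, parameter names, and units are hypothetical and do not come from the cited references.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Convert a disparity (pixels) to a depth (metres).

    Assumes a rectified stereo pair with known focal length (pixels)
    and baseline (metres), i.e. known stereo geometry. A non-positive
    disparity is treated as a point at infinity.
    """
    if disparity_px <= 0:
        return float("inf")
    return focal_length_px * baseline_m / disparity_px


def depth_map_from_disparity_map(disparity_map, focal_length_px, baseline_m):
    """Apply the conversion to every pixel of a 2D disparity map."""
    return [
        [depth_from_disparity(d, focal_length_px, baseline_m) for d in row]
        for row in disparity_map
    ]
```

Doubling the disparity halves the computed depth, which is the sense in which the two maps are trivially related once the camera parameters are known.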

28. Under certain constraints, it can be shown that the optical flow⁷ between a set of images and the
disparity (hence, depth) are approximations of one another. To illustrate this, the notion of a motion field is 
introduced. The motion field is the two-dimensional vector field of velocities of the image points, induced 
by the relative motion between the viewing camera and the observed scene ([Trucco98], page 183). This 
relative motion may manifest itself as the viewing camera moving about a static scene. For static scenes, 
movement of the camera about the scene is equivalent to capturing the scene from a plurality of fixed 
cameras located at discrete locations along the path of the camera. The derivation of the motion field 
induced by a camera moving relative to a static scene is thus conceptually similar to the stereo 
correspondence problem for pairs of cameras fixed along the path of the moving camera. Indeed, the 
motion field coincides with the stereo disparity map when spatial and temporal differences between frames 
are sufficiently small ([Trucco98], page 185: Stereo Disparity Map and Motion Field). Returning to the 
discussion of optical flow, [Trucco98] points out that, if one assumes a globally illuminated scene of 
Lambertian (diffusive) surfaces, then optical flow is an approximation of the motion field ([Trucco98],
page 195: Optical Flow and Motion Field). Taking into account the previous observations, the following
can be concluded. If a set of input images depicts a globally illuminated scene of Lambertian (diffusive)
surfaces, from a corresponding set of tightly spaced and spatially coherent viewpoints, then the disparity
map and optical flow field are approximations of one another⁸,⁹. These observations imply that, under the
first and second constraints, the derivation of the disparity (depth) maps, using photogrammetric methods,
involves:

(4.a.) Calculating a plurality of optical flow values (disparity) between the set of input images.

6 This is a key assumption in [ChenWilliams93]. See [ChenWilliams93], Section 2.1, paragraph 1, sentence 3.

7 The optical flow is defined as the apparent motion of the image brightness pattern.

29. The disparity of an image pixel is actually the parallax¹⁰ caused by viewing the corresponding
scene point from different viewpoints. Disparity in image pairs is often referred to as binocular parallax. 
Thus, in calculating the disparity of an image pixel, one has also calculated a parallax value associated with 
the pixel. Given this observation, the derivation of the disparity (depth) maps further includes: 

(4.b.) Calculating a plurality of parallax values (disparity) corresponding to pixels (i.e. a 
plurality of image coordinates) in the given input images from optical flow values 
(disparity). 

30. [ChenWilliams93] satisfies both the first constraint ([ChenWilliams93] page 280, left column, 
paragraph 1, sentence 1) and the second constraint ([ChenWilliams93] page 280, right column, paragraph 
1, lines 8-12). Therefore, the derivation of the depth maps implies steps (4.a.)-(4.b.) above and, thus, the 
step of: 

(4.c.) Calculating the depth (disparity) maps using the image pixels and the parallax (disparity) 
values. 

That is, steps (4.a.)-(4.c.) are implicit to the calculation of the depth maps by stereo reconstruction in the 
method of [ChenWilliams93]. 



8 Note that the images are given and can be presumed to have been captured simultaneously. In this case, the temporal difference 
between images is negligible. 

9 For the sake of brevity, the constraint of small spatial and temporal differences between frames will be referred to as the "first 
constraint"; and the "second constraint" will refer to the assumption of global illumination and Lambertian (diffusive) 
reflectivity for all scene surfaces. 

10 Parallax is the apparent displacement or the difference in apparent location of an object as seen from two different viewpoints
not on a straight line with the object.

31. Claim 7 is rejected under 35 U.S.C. § 103(a) as being unpatentable over [ChenWilliams93], in
view of [Rogina01] (U.S. Patent Application Publication 2001/0043737, assigned to Rogina et al.).

32. The following is in regard to Claim 7. As shown above, [ChenWilliams93] satisfies all limitations
of Claim 1. However, [ChenWilliams93] does not disclose selecting the input images based on a proximity
of the virtual viewpoint to the viewpoints corresponding to the input images.

33. [Rogina01] discloses a method of providing an image from an arbitrary virtual viewpoint. In that
method, a plurality of discrete two-dimensional images are acquired, each corresponding to the image of a
scene observed from a plurality of discrete viewpoints on a predetermined viewpoint locus ([Rogina01]
column 2, paragraph [0011], sentences 1-2; see also Fig. 1). In a process analogous to [ChenWilliams93],
[Rogina01] uses an input viewpoint (base viewpoint) to map from transform images into the virtual
viewpoint image ([Rogina01] column 2, paragraph [0011], last sentence). The base viewpoint is selected
from the discrete viewpoint locus. According to [Rogina01], it is desirable to select a base viewpoint
close to the virtual viewpoint. See [Rogina01] column 2, paragraph [0011], sentences 5-6. Note that
[Rogina01] also uses adjacent viewpoints in the view interpolation ([Rogina01], Abstract, lines 10-14). In
that sense, the selection of the base viewpoint entails a selection of additional adjacent viewpoints (which
should also be close to the virtual viewpoint) - that is, at least two proximate images are selected.

34. It would have been obvious to one of ordinary skill in the art, at the time of the Applicant's
claimed invention, to incorporate this simple selection process into [ChenWilliams93]. According to
[Rogina01], selecting the viewpoints closest to the virtual viewpoint alleviates skewing and accurately
reflects occlusions of distant objects by close objects ([Rogina01] column 13, paragraph [0102]).
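
A proximity-based selection of this kind can be sketched in a few lines. The example below simply ranks cameras by Euclidean distance to the virtual viewpoint and keeps the closest ones; the function name, the use of 3D position tuples, and the default of two views are assumptions for illustration and are not taken from [Rogina01] or [ChenWilliams93].

```python
import math


def select_nearest_views(virtual_viewpoint, camera_positions, k=2):
    """Return the indices of the k cameras closest to the virtual viewpoint.

    Positions are coordinate tuples of equal length. Ties keep the
    original camera order because sorted() is stable.
    """
    order = sorted(
        range(len(camera_positions)),
        key=lambda i: math.dist(virtual_viewpoint, camera_positions[i]),
    )
    return order[:k]
```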

35. Claims 6 and 14 are rejected under 35 U.S.C. § 103(a) as being unpatentable over
[ChenWilliams93], in view of [LuoMaitre90] (W. Luo and H. Maître, Using Surface Model to Correct and
Fit Disparity Data in Stereo Vision, IEEE, 1990).

36. The following is in regard to Claim 6. As shown above, [ChenWilliams93] satisfies all limitations
of Claim 1. However, [ChenWilliams93] does not create the aforementioned depth (disparity) maps by:

(6.a.) Separating the given set of images into a plurality of segments, wherein pixels of each
segment have substantially homogeneous values.

(6.b.) Calculating a depth value corresponding to each segment.

(6.c.) Optimizing the depth values corresponding to each segment.

(6.d.) Creating the aforementioned depth maps from the plurality of optimized depth values.

37. [LuoMaitre90] disclose a method for stereo reconstruction¹¹ ([LuoMaitre90] Abstract) comprising
the steps of:

(6.a.) The images are segmented into regions of substantially uniform values (gray values). See
[LuoMaitre90] Section 3.1, item (b) and Abstract, sentence 3.

(6.b.) The depth value (disparity) of each segment is calculated. See [LuoMaitre90] Section
3.1, item (a) and second to last paragraph, sentence 2.

(6.c.) The disparities of each segment (referred to, henceforth, as the disparity map of a
segment) are optimized by the following:

1. Fitting a plane to the disparity map of each segment. See [LuoMaitre90] Section 3.1,
second to last paragraph, sentence 4.

2. The goodness-of-fit is determined. See [LuoMaitre90], Section 3.1, last paragraph.

3. Errors are corrected ([LuoMaitre90], Section 3.1, last paragraph, last sentence and
Section 3.2).

4. If the fit is still unacceptable, the segment is subdivided. See Section 3.3 of
[LuoMaitre90].

(6.d.) If the fitted planar model is acceptable for a given segment, it is fit to the measured
disparity map. The fitted plane then becomes the "optimized" disparity map for the given
segment. See [LuoMaitre90] Section 3.4. This is clearly done for all segments in each of
the input images so as to obtain a complete disparity (depth) map for each of the images.

11 Recall from above that stereo reconstruction yields a disparity or depth map associated with a given image.

38. The primary advantage of [LuoMaitre90] is that the fitted plane can provide a dense set of disparity
values (depths) from a sparse set of measured disparities. Furthermore, as a mathematical model, the fitted
plane has sub-pixel resolution. Taking this into account, it would have been obvious to one of ordinary skill
in the art, at the time of the Applicant's claimed invention, to derive the depth (disparity) maps of
[ChenWilliams93] according to the teachings of [LuoMaitre90].
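
To illustrate how a fitted plane yields dense, sub-pixel disparities from sparse measurements, the sketch below fits d = a*x + b*y + c to the measured disparities of one segment by ordinary least squares and then evaluates the plane at arbitrary pixels. This is a generic least-squares fit written for illustration, not the specific fitting, error-correction, or subdivision procedure of [LuoMaitre90]; all names are hypothetical.

```python
def fit_disparity_plane(samples):
    """Least-squares fit of d = a*x + b*y + c to (x, y, d) samples.

    Builds the 3x3 normal equations and solves them by Gauss-Jordan
    elimination with partial pivoting. Returns (a, b, c).
    """
    sxx = sxy = sx = syy = sy = n = 0.0
    sxd = syd = sd = 0.0
    for x, y, d in samples:
        sxx += x * x
        sxy += x * y
        sx += x
        syy += y * y
        sy += y
        n += 1.0
        sxd += x * d
        syd += y * d
        sd += d
    # Augmented matrix [A^T A | A^T d] for the unknowns (a, b, c).
    m = [
        [sxx, sxy, sx, sxd],
        [sxy, syy, sy, syd],
        [sx, sy, n, sd],
    ]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("samples are too few or collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def dense_disparity(plane, pixels):
    """Evaluate the fitted plane at every (x, y) in pixels."""
    a, b, c = plane
    return {(x, y): a * x + b * y + c for (x, y) in pixels}
```

Because the plane is a continuous model, it can be evaluated at any pixel of the segment, including pixels for which no disparity was measured.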

39. The following is in regard to Claim 14. As shown above, [ChenWilliams93] satisfies all
limitations of Claim 12. As just discussed with respect to Claim 6, [LuoMaitre90] is a segmentation-based
method for disparity (depth) calculation. Note that the brightness value is never used in [LuoMaitre90],
aside from its use in evaluating the homogeneity of image regions. Therefore, it would have been obvious
to one of ordinary skill in the art, at the time of the Applicant's claimed invention, to combine
[LuoMaitre90] and [ChenWilliams93], in the manner suggested above, and further extend [LuoMaitre90]
to accommodate color images.

40. Claim 13 is rejected under 35 U.S.C. § 103(a) as being unpatentable over [ChenWilliams93], in 
view of [Saito99] (H. Saito et al., Appearance-Based Virtual View Generation of Temporally-Varying 
Events from Multi-Camera Images in the 3D Room, IEEE, 1999). 

41. The following is in regard to Claim 13. As shown above, [ChenWilliams93] satisfies all
limitations of Claim 12. [ChenWilliams93], however, does not disclose using a view-based volumetric 
mapping means to create depth maps of the images. 

42. [Saito99] proposes an "appearance [view]-based" virtual view generation method ([Saito99]
Abstract). Depth images are derived for each camera using a multi-baseline stereo methodology ([Saito99]
Section 4.1, paragraph 1). These depth images are merged to form a three-dimensional volumetric model
([Saito99] Section 4.1, paragraph 2). Using the volumetric model to resolve occlusions, [Saito99] derive a
disparity (depth) map for each of the input views ([Saito99] Section 4.2 and Fig. 7). Clearly, in this sense,
[Saito99] represents a view-based volumetric mapping means for creating depth (disparity) maps.

43. This volumetric process is superior because it successfully resolves occluded regions in all of the 
given views ([Saito99], Abstract, sentences 4-5). Therefore, it would have been obvious to one of ordinary 
skill in the art, at the time of the Applicant's claimed invention, to use the method of [Saito99] to create
depth images for each of the input images of [ChenWilliams93]. 

Citation of Relevant Prior Art

44. The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: 

45. [Sawh3D94], [SawhSMM94], and [Kumar94] all relate to recovery of the 3D geometry of a scene,
observed at multiple viewpoints, using planar parallax. These may provide some theoretical insight into the
Applicant's disclosed methods.

[Sawh3D94] H. Sawhney, 3D Geometry from Planar Parallax. IEEE, 1994. 

[SawhSMM94] H. Sawhney, Simplifying Multiple Motion and Structure Analysis Using Planar Parallax and Image Warping. IEEE, 1994.

[Kumar94] R. Kumar, P. Anandan, and K. Hanna, Shape Recovery from Multiple Views: A Parallax Based Approach. Sarnoff Technical Report, 1994.



46. Various important IBR publications. Given the broadness of the current claim language, several of 
these methods "read on" the Applicant's claimed invention. 

[LavFaug94] S. Laveau and O. Faugeras, 3-D Scene Representation as a Collection of Images. IEEE, 1994.

[SeitzDyer96] S. Seitz and C. Dyer, View Morphing. ACM-SIGGRAPH, 1996.

[Gortler96] S. Gortler et al., The Lumigraph. International Conference on Computer Graphics and Interactive Techniques: Proceedings of the 23rd annual conference on Computer graphics and interactive techniques, 1996.

[Shade98] J. Shade et al., Layered Depth Images. ACM-SIGGRAPH, 1998.

[McMillan95] L. McMillan and G. Bishop, Plenoptic Modeling: An Image Based Rendering System. ACM-SIGGRAPH, 1995.

47. [Mark97], like [Saito99] above, discloses several aspects of the Applicant's claimed invention, 
including the derivation of the depth maps for each input image and the warping and compositing of each 
input image to form a virtual image. 

[Mark97] W. Mark, L. McMillan, G. Bishop, Post-Rendering 3D Warping. ACM-SIGGRAPH, 1997.

48. The following are IBR methods for recovering depth and/or synthesizing new views. All involve
some form of interpolation, warping, or morphing of the given images. 

[Geogiev01] T. Geogiev, U.S. Patent 6,268,846. Filing Date: June 1998.

[Sato02] K. Sato, U.S. Patent 6,445,815. Filing Date: March 1999.

[Endo02] T. Endo et al., U.S. Patent Application Publication 2002/0171666. Filing Date: August 1999.

49. Other "color-segmentation depth calculation means". 

[Moravec99] K. Moravec et al., Using an Image Tree to Assist Stereo Matching. IEEE, 1999.

[BlackJep96] M. Black and A. Jepson, Estimating Optical Flow in Segmented Images Using Variable-Order Parametric Models with Local Deformation. IEEE, 1996.



50. Publications by the Inventors related to the claimed invention. 

[TaoSawhGMC01] H. Tao and H. Sawhney. Global Matching Criterion and Color Segmentation Based Stereo. Sarnoff Technical Paper, December 2000.

[SawhGuo01] H. Sawhney et al. Hybrid Stereo Camera: An IBR Approach for Synthesis of Very High Resolution Stereoscopic Image Sequences. ACM-SIGGRAPH, August 2001.

[TaoSawhGMM01] H. Tao, H. Sawhney and R. Kumar. A Global Matching Framework. IEEE, July 2001.



Any inquiry concerning this communication or earlier communications from the examiner should 
be directed to Kevin Siangchin whose telephone number is (703)305-7569. The examiner can normally be 
reached on 9:00am - 5:30pm, Monday - Friday. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, 
Amelia Au can be reached on (703)308-6604. The fax phone number for the organization where this 
application or proceeding is assigned is 703-872-9306. 

Information regarding the status of an application may be obtained from the Patent Application 
Information Retrieval (PAIR) system. Status information for published applications may be obtained from 
either Private PAIR or Public PAIR. Status information for unpublished applications is available through 
Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 
866-217-9197 (toll-free). 